Human Mutation
○ Wiley
Preprints posted in the last 90 days, ranked by how well they match Human Mutation's content profile, based on 14 papers previously published here. The average preprint has a 0.06% match score for this journal, so anything above that is already an above-average fit.
De, T.; Faruq, M.
Hereditary ataxias are complex neurological disorders with enormous genetic heterogeneity and diverse genetic mechanisms. Among these mechanisms, tandem nucleotide repeat expansions (TNRex) are the most common cause of genetic ataxias, followed by single nucleotide variations (SNVs) in over 200 genes. Detection and diagnosis of tandem nucleotide repeats in clinics and laboratories has been far more common than that of SNVs, owing to the large number of mutations in the respective genes in which they are found. The widely used platforms for detecting these mutations are capillary electrophoresis and next-generation sequencing-based targeted gene panels or clinical or whole exome sequencing. Long-read sequencers have proven useful for detecting tandem nucleotide repeat expansions. We have developed a method to detect TNRex and SNVs in one experiment and on a single platform, Oxford Nanopore Technologies, using an adaptive sequencing approach. We were able to optimize the target region sequencing of both TNR loci and SNV loci and validate the capture of both by detection of FXN-GAA repeats and pathogenic SNVs in SETX
Matton, C.; Van De Velde, J.; De Bruyne, M.; Van De Sompele, S.; Hooghe, S.; Syryn, H.; Bauwens, M.; D'haene, E.; Dheedene, A.; Cools, M.; Komatsuzaki, S.; Preizner-Rzucidlo, E.; Ross, A.; Armstrong, C.; Watkins, W.; Shelling, A.; Vincent, A. L.; Cassiman, C.; Vermeer, S.; Bunyan, D. J.; Verdin, H.; De Baere, E.
Heterozygous FOXL2 (non-)coding sequence and structural variants (SVs) lead to blepharophimosis, ptosis and epicanthus inversus syndrome (BPES), a rare, autosomal dominant developmental disorder characterized by a completely penetrant eyelid malformation and incompletely penetrant primary ovarian insufficiency (POI). We collected variants from our in-house database, generated via clinical genetic testing and downstream research testing in the Center for Medical Genetics Ghent, Belgium (2001-2024), and via literature and other resources in the same period. All retrieved variants were categorized using ACMG/AMP classifications to increase the knowledge of pathogenicity. We collected 413 unique genetic defects of the FOXL2 region, including 76 novel variants, in 864 index patients. Of these patients, 87% were identified with a coding FOXL2 sequence variant. The polyalanine tract is a known mutational hotspot of FOXL2, illustrated here by the high percentage of pathogenic polyalanine expansions (24%). Furthermore, the molecular spectrum in typical BPES index patients is characterized by 8% coding deletions and 3% deletions located up- and downstream of FOXL2. The remaining 2% carry translocations along with chromosomal rearrangements of 3q23. This uniform and structured reclassification, incorporating the largest dataset of variants implicated in FOXL2-associated disease so far, will improve both diagnosis and genetic counselling for individuals with BPES.
Grimwade, I. J.; Fasham, J.; Wright, C. F.; Jackson, L.
Severe combined immunodeficiency (SCID) is a heterogeneous, recessive disorder, associated with the onset of severe, recurrent infections in the first few months of life. SCID is fatal if left untreated, but outcomes can be significantly improved by prompt diagnosis and treatment, particularly prior to onset of infection. Consequently, SCID is already included in many newborn screening programmes around the world, as well as multiple international genomic newborn screening (gNBS) research programmes. However, there is a vital need to estimate penetrance of SCID variants in population cohorts, to mitigate the potential consequences of reporting low penetrance variants in a genotype-first gNBS setting. This study aimed to assess the penetrance and prevalence of these variants in the UK Biobank population cohort. Whole genome sequencing data from 490,640 individuals was used to interrogate 16 SCID genes for potentially causal variation. We identified 4206 carriers of single heterozygous pathogenic variants (~1% of cohort), but only 6 individuals double heterozygous, homozygous or hemizygous for relevant pathogenic variants. Three individuals would be expected to require further testing had they been identified by gNBS, suggesting that fewer than 1 in 100,000 newborns might require follow-up testing due to SCID variants. Following detailed variant curation, we were able to identify only 2 unaffected individuals likely to be harbouring biallelic pathogenic variants, potentially indicative of reduced penetrance. Nonetheless, SCID remains an excellent candidate for inclusion in gNBS studies, due to its severity, clinical actionability and expected low false positive rate, although care should be taken when reporting hypomorphic variants.
Neurgaonkar, P.; Dierolf, M.; O'Gorman, L.; Remmele, C.; Schaeffer, J.; Popp, I.; Borst, A.; Rost, S.; Ankenbrand, M.; Kratz, C.; Bergmann, A.; Kalb, R.; Yu, J.
Motivation: Fanconi anemia (FA) is a rare disease mainly caused by biallelic pathogenic variants, including structural variants such as large deletions and insertions in FA genes. Currently, variant detection is based on short-read sequencing and probe-based approaches. However, determining the exact genomic breakpoint or achieving allelic discrimination remains challenging. Nanopore-based long-read sequencing enables comprehensive detection of FA variants, but a unified bioinformatic analysis platform for these data is missing. Results: We present FA-NIVA (Fanconi anemia - Nanopore Indel and Variant Analysis), an automated and adaptable analysis workflow tailored for Nanopore-based long-read sequencing data in FA genetic analysis. FA-NIVA integrates state-of-the-art tools to comprehensively detect both single nucleotide variants (SNVs) and structural variants (SVs). Our analysis platform enhances genotyping accuracy for biallelic variants by a joint SNV-SV based phasing in FA-associated genes. Built within the Nextflow ecosystem and powered by containerized Docker images, FA-NIVA ensures reproducibility, flexibility, scalability and transparency across different computing environments. Together, FA-NIVA provides a robust end-to-end solution for the automated analysis of SVs and SNVs and high-resolution phasing analysis in FA genes, enabling an accurate and efficient pipeline for genetic analysis. Availability: FA-NIVA is available on GitHub at: https://github.com/UKWgenommedizin/FA-NIVA.
Louw, N.; Makay, P.; Mpangase, P.; Naicker, T.; Yates, L.; Honey, E.; Mbungu, G.; Van Den Bogaert, K.; Firth, H.; Hurles, M.; Lukusa, P.; Devriendt, K.; Krause, A.; Carstens, N.; Lumaka, A.; Lombard, Z.
Copy number variants (CNV) contribute significantly to the pathogenic variation associated with developmental disorders. CNV detection is often not included in standard exome sequencing (ES) analysis. Complementary methods such as chromosomal microarray are typically offered in diagnostic laboratories to diagnose pathogenic CNV. In this study, we aimed to develop an optimal approach for incorporating CNV detection within our ES analysis process for the Deciphering Developmental Disorders in Africa (DDD-Africa) cohort. We analyzed ES data from 505 probands with a developmental disorder, applying a CNV detection approach that assessed data generated using the tools CANOES and XHMM. When available, parental ES data was used to assess inheritance patterns. We confirmed a diagnosis in 42/505 (8.3%) patients with 44 pathogenic CNV identified in the probands. There were 31 deletions and 13 duplications. Among the 27 probands with parental data, all identified CNV were de novo. The addition of CNV analysis to our ES analysis pipeline resulted in an 8.3% increase in diagnostic yield in the DDD-Africa cohort without additional laboratory cost. This is a feasible approach that is likely to reduce analytical cost and is suitable for low- and middle-income countries where funding and resources for genomic medicine initiatives are limited.
Bennett, J. J.; Laver, T. W.; Mannisto, J. M. E.; Houghton, J. A. L.; De Franco, E.; Kalyon, O.; Wright, S.; Johnson, A.-M.; De Leon, D. D.; Globa, E.; Kummer, S.; Banerjee, I.; Dastamani, A.; International Congenital Hyperinsulinism Consortium, ; Wakeling, M. N.; Johnson, M. B.; Flanagan, S. E.
A substantial proportion of individuals with a well-defined monogenic disorder remain without a genetic diagnosis. Low-level mosaic pathogenic variants are increasingly recognised as an underappreciated cause of monogenic disease but are technically challenging to detect, particularly in organ-specific conditions when affected tissue is inaccessible. We systematically investigated low-level mosaic variants in individuals with congenital hyperinsulinism (CHI: n=1,252) or neonatal diabetes (NDM: n=312), two opposing pancreatic disorders of insulin secretion. We screened for established pathogenic variants with variant allele fraction (VAF) <8% in dominant CHI (ABCC8, GCK, GLUD1, HK1) or dominant NDM (ABCC8, KCNJ11, INS) genes in targeted next generation sequencing (tNGS) data using Mutect2. This called 40 variants across the four genes in 39 individuals with CHI. No candidate variants were found in the NDM cohort. Orthogonal validation of 35 variants using TaqMan-based droplet digital PCR (ddPCR) confirmed 26/35 variants. The median VAF for confirmed variants was 3.6% (1.1-7.8%), while false positives (9/35) predominantly had a VAF <1% with some overlap in VAF with true positives. This study shows that disease-causing low-level mosaic variants in dominant CHI genes can be detected in blood using tNGS but require orthogonal validation. These results provide a framework to improve diagnostic yield in organ-specific conditions where mosaic variants may represent an important missed cause of disease.
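The VAF screening step mentioned in this abstract can be illustrated with a minimal sketch. It assumes that alternate and total read counts per call are already available; the `Call` record, the function names, and the read counts are hypothetical and are not taken from the authors' Mutect2-based pipeline.

```python
from dataclasses import dataclass


@dataclass
class Call:
    gene: str
    alt_reads: int    # reads supporting the variant allele
    total_reads: int  # total read depth at the site


def vaf(call: Call) -> float:
    """Variant allele fraction = alt reads / total depth."""
    if call.total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return call.alt_reads / call.total_reads


def low_level_mosaic_candidates(calls, max_vaf=0.08):
    """Keep calls with a non-zero VAF below the threshold (8% in the abstract)."""
    return [c for c in calls if 0 < vaf(c) < max_vaf]


# Hypothetical read counts, for illustration only.
calls = [
    Call("ABCC8", 18, 500),   # VAF 3.6%
    Call("GCK", 240, 500),    # VAF 48%, in the range of a heterozygous call
    Call("GLUD1", 4, 500),    # VAF 0.8%
]
candidates = low_level_mosaic_candidates(calls)
```

A filter like this only nominates candidates; as the abstract notes, calls at very low VAF still need orthogonal confirmation such as ddPCR.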
DeBortoli, E.; Clinch, T.; Vaz-Goncalves, L.; Burbury, L.; Jeppesen, M.; Pinzon Charry, A.; Melo, M.; Sullivan, A.; Hunter, M.; Peake, J.; McInerney-Leo, A.; McNaughton, P.; Yanes, T.
Purpose: While genomic testing is integral to pediatric inborn errors of immunity (IEI) care, few studies have examined strategies to support its optimal delivery. This study aimed to characterize a pediatric IEI cohort and assess the impact of implementing a mainstream model-of-care (MoC). Materials/Methods: A comprehensive chart audit was conducted for patients (≤18 y) who received IEI genomic testing in Queensland, Australia, from 2017-2025. Descriptive analyses captured demographic and clinical characteristics, genomic testing and results, and management outcomes. Inferential analyses assessed changes in genomic practices pre-MoC (<2021) and post-MoC (≥2021). Results: 322 patients met eligibility criteria (n=481 genomic tests). Diagnostic yield (27.6%) varied by testing indication, with the highest rate among phagocytic defects (n=4/4; 100%) and severe combined immunodeficiency (n=8/10; 80%). Very-early-onset inflammatory bowel disease had the lowest diagnostic yield (n=3/68; 4.4%), prompting changes to testing criteria. Molecular diagnosis resulted in management changes for 90.5% of patients. Genomic testing was widely used pre-MoC (n=251 genomic tests). All outcomes improved significantly from pre- to post-MoC (p<0.05): duplicate testing decreased (13.9% to 0%); variants of uncertain significance reduced (37.7% to 7.1%); informed consent documentation increased (70.5% to 88.4%); and diagnostic yield increased (16.2% to 27.4%). Conclusion: Targeted interventions are needed to support delivery of genomic testing and strengthen service effectiveness.
Spencer, C.; Machado-Paula, L.; Qian, F.; Butali, A.; Buxo, C. J.; Padilla, C.; Restrepo-Muneton, C.; Valencia-Ramirez, C.; Long, R. E.; Weinberg, S.; Marazita, M. L.; Murray, J. C.; Moreno-Uribe, L. M.; Petrin, A. L.
Objective: Orofacial clefts may involve the complete vertical thickness of the lip (complete) or partial thickness (incomplete). This study evaluates side preference for completeness in nonsyndromic asymmetric bilateral and unilateral cleft lip with or without cleft palate (NSCL/P). Design: We studied 4 multiethnic cohorts from North and South America, Asia, and Africa, including 3,561 individuals with NSCL/P. Associations between cleft completeness, sex, ethnicity, and race were assessed using Chi-square or Fisher's exact test (α=0.05). Participants: Patients with NSCL/P with complete information on cleft type and completeness were included. Our main goal was to analyze side preference of complete clefting in different demographic groups, sex and race. Results: Amongst asymmetric bilateral cases, left side completeness was significantly more frequent than right side completeness (73.7% vs. 26.3%; p<0.001). No associations were observed for sex or race, with ethnicity showing a trend toward significance (50.0% vs. 25.5%; p=0.088). Amongst symmetric bilateral and unilateral cases, Hispanics exhibited completeness more frequently than non-Hispanics (96.4% vs. 89.5%; p<0.001; 84.1% vs. 79.7%; p<0.001). For unilateral cases, completeness showed no side preference. Caucasians were less likely to exhibit complete clefts compared to Asians, Blacks, or other racial groups (68.7% vs. 84.9% or 81.2% or 81.7%; p<0.001). Females more frequently presented with completeness than males (81.2% vs. 76.6%; p=0.003). Conclusions: In NSCL/P with bilateral asymmetry, the left side is more often complete than the right side. Although unilateral left-sided clefts are more common overall, completeness shows no side preference. Race and ethnicity demonstrate significant associations with cleft severity patterns.
Koch, R. L.; Akman, H. O.; Chown, E.; Goldman, D.; Levenson, J.; Lu, Q.; Michalovicz Gill, L. T.; Morgan, M.; Orthmann-Murphy, J.; Pires, N. T.; Reef, R.; Saxe, H.; Singer-Berk, M.; Baxter, S.
Glycogen storage disease type IV (GSD IV) is an autosomal recessive disorder caused by pathogenic variants in GBE1, resulting in deficient glycogen branching enzyme (GBE) activity and formation of abnormal glycogen ("polyglucosan"). GSD IV manifests across a spectrum of clinical dimensions - including hepatic, neurologic, muscular, and cardiac involvement - which vary in severity. The early-onset forms, historically referred to as Andersen disease, present at different stages ranging from in utero to adolescence. The adult-onset form, referred to as adult polyglucosan body disease (APBD), typically presents in middle to late adulthood. To date, no epidemiological study of GSD IV has been performed. Understanding the global prevalence of GSD IV is critical to increase disease awareness, improve diagnostic rates, inform therapeutic development, and engage pharmaceutical companies. In collaboration with the Rare Genomes Project at the Broad Institute of MIT and Harvard and the APBD Research Foundation, this study curated variants in GBE1 and calculated prevalence across nine genetic ancestry groups. The estimated global carrier frequency of GSD IV is 1 in 243 individuals, and the global genetic prevalence is 1 in 235,784 individuals. Based on the 2024 world population, the estimated number of affected individuals with GSD IV is approximately 34,800. These estimates highlight a significant underdiagnosis of GSD IV and underscore the urgent need for increased awareness of this metabolic disorder. This model of collaboration between researchers, patient advocacy organizations, and genetic data sharing programs provides a framework for estimating the prevalence of other rare diseases in the global population. 
Graphical abstract: Figure 1. Created in BioRender. Koch, R. (2025) https://BioRender.com/j0sg30n.
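The relationship between the carrier frequency and genetic prevalence figures quoted in the GSD IV abstract can be sketched with a simple Hardy-Weinberg calculation. This is a simplification under stated assumptions: it uses one pooled pathogenic allele frequency, whereas the study curated individual GBE1 variants and calculated prevalence across nine genetic ancestry groups, so the sketch only approximates the reported values. The world population of 8.2 billion is also an assumption, not a figure from the abstract.

```python
import math


def carrier_frequency(q: float) -> float:
    """Heterozygous carrier frequency 2q(1 - q) under Hardy-Weinberg."""
    return 2.0 * q * (1.0 - q)


def genetic_prevalence(q: float) -> float:
    """Frequency of biallelic genotypes, q squared."""
    return q * q


def allele_frequency_from_carrier(carrier: float) -> float:
    """Invert 2q(1 - q) = carrier, taking the smaller root."""
    return (1.0 - math.sqrt(1.0 - 2.0 * carrier)) / 2.0


# Reported carrier frequency: 1 in 243.
q = allele_frequency_from_carrier(1 / 243)
one_in = 1 / genetic_prevalence(q)  # close to the reported 1 in 235,784

# Assumed world population of 8.2 billion (not stated in the abstract).
affected = 8.2e9 / 235_784
```

Under these assumptions the pooled calculation lands within about 1% of the reported prevalence, and the affected count is in line with the approximately 34,800 individuals quoted above.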
Carta, M. G.; Angeloni, M.; Toegel, L.; Schubart, C.; Hoelsken, A.; Stoehr, R.; Vatrano, S.; Rizzi, D.; Magni, P.; Fraggetta, F.; Hartmann, A.; Haller, F.; Ferrazzi, F.
Molecular Tumour Boards (MTBs) rely on different bioinformatics tools and knowledgebases for variant annotation, oncogenicity classification, and estimation of complex biomarkers to identify actionable alterations. However, the typical bioinformatics workflow to process raw next-generation sequencing (NGS) data into clinically meaningful variants involves multiple steps and is inherently complex, thus requiring repeated manual intervention and causing delays in providing molecularly informed precision oncology. Here, we aimed at overcoming these limitations by developing a fully-automated integrative workflow to support NGS-based analyses within MTBs. Our workflow was established at the Institute of Pathology, University Hospital Erlangen (Germany), and adapted to the fully digitized Pathology department at Gravina Hospital in Caltagirone (Italy), using the Illumina TruSight Oncology 500 HRD assay as case study. A trigger event initiates all the downstream bioinformatics analyses to support variant interpretation. In Erlangen, the trigger event is the automatic detection of new NGS data on the Illumina Connected Analytics cloud-based platform. In Caltagirone, the analyses are manually triggered from the anatomic pathology laboratory information system (AP-LIS). The workflow automatically: (i) generates an intuitive overview of sequencing quality metrics, (ii) performs variant annotation, (iii) classifies variant oncogenicity through a fully-automated implementation of the ClinGen/CGC/VICC guidelines, and (iv) generates homologous recombination deficiency scores with genomic instability plots. In the digitized pathology department, results can be readily opened from the AP-LIS and visualized in the patient gallery. Taken together, our end-to-end fully-automated workflow streamlines NGS-based analyses within MTBs by integrating variant interpretation, oncogenicity classification, and estimation of clinically relevant biomarkers.
Moreno, G.; Rebolledo-Jaramillo, B.; Böhme, D.; Encina, G.; Martin, L. M.; Zavala, M. J.; Espinosa, F.; Hasbun, M. T.; Poli, M. C.; Faundes, V.; Repetto, G. M.
Background: Exome sequencing (ES) has become a key diagnostic tool for rare diseases (RDs). However, most evidence on ES performance comes from high-income countries and patients of European ancestry. In countries such as Chile, limited access to next generation sequencing amplifies health disparities and highlights the need to identify which patients are most likely to benefit from ES. Methods: This study presents the second phase of the Chilean DECIPHERD project, in which we performed ES in a new group of patients with RDs presenting with multiple congenital anomalies (MCA), neurodevelopmental disorders (NDD), and/or suspected inborn errors of immunity. To identify clinical and demographic factors associated with an increased probability of obtaining an informative ES result, we conducted a logistic regression analysis, combining the results of the first and second phases of the project. We also objectively evaluated global ancestry, measured using ADMIXTURE, as a potential factor. Results: Sixty-seven patients participated in this second phase of DECIPHERD with a median age of 6 years (range: 0-27); 55.2% were female, with an average (± s.d.) proportion of Native American ancestry of 0.615 ± 0.18. Clinically, 52.2% presented with both MCA and NDD, and the rest had other phenotype combinations. An informative result, including pathogenic or likely pathogenic variants in genes consistent with the patient's phenotype, was identified in 34.3% of the cohort; 61% of these variants had not been previously reported in databases such as ClinVar. By combining the two phases of the study, we reached a total of 167 patients, in whom the presence of NDD and/or MCA significantly increased the probability of achieving an informative ES outcome. In contrast, previous use of gene panel testing was associated with a decreased likelihood of receiving an informative result. Ancestry was not associated with diagnostic yield.
Conclusions: This study demonstrates the utility of ES in achieving a diagnosis in a clinically diverse cohort of Chilean patients with RDs, and characterizes features associated with a higher diagnostic yield. These findings may contribute to evidence-based patient prioritization strategies in settings with limited access to NGS resources.
Torr, B.; Mansour, L.; Fierheller, C. T.; Hamill, M.; Nolan, J.; Bell, N.; Choi, S.; Allen, S.; Muralidharan, S.; MacMahon, S.; Clinch, Y.; Valganon-Petrizan, M.; Harder, H.; Garrett, A.; Evans, D. G.; George, A.; Jenkins, V.; Fallowfield, L.; Legood, R.; Kemp, Z.; Manchanda, R.; Turnbull, C.
Background: Breast cancer susceptibility gene testing (BCSG-testing) is expanding in relation to both eligibility for testing and number of genes included on testing panels. However, uncertainty remains regarding the most effective testing strategies for identifying clinically actionable germline pathogenic variants (gPVs) while balancing increased burden on breast and genetics clinical services. Patients and Methods: The North Thames Mainstreaming of Breast Cancer Genetic Testing (NT-MBGT) programme piloted unselected breast cancer (BC) patient BCSG-testing via a clinician-light BRCA-DIRECT mainstreaming pathway. We present real-world evaluation of (i) gPV pick-up rates according to BC characteristics and (ii) operational feasibility, acceptability, and satisfaction with the BRCA-DIRECT expanded testing pathway. Results: The BRCA-DIRECT pathway successfully tested 3,517 newly-diagnosed BC patients within 14 National Health Service (NHS) breast oncology units, with high levels of patient and breast healthcare professional (HCP) satisfaction, and genetics HCPs reporting concomitant decrease in service referrals. The overall pick-up rate of gPVs was 4.7%. Current NHS eligibility criteria would have offered testing to 20.6% of patients and identified 49.2% of observed gPVs in high penetrance (HP)-BCSGs (BRCA1/BRCA2/PALB2) and 18.2% of gPVs in intermediate penetrance (IP)-BCSGs (CHEK2/ATM/RAD51C/RAD51D). Ultra-simple eligibility criteria could improve detection (sensitivity) to 74.6% and 61.4%, respectively, whilst increasing testing to 50.2% of BC cases. Conclusions: Evidence from the NT-MBGT programme demonstrates that expanding BCSG-testing via a clinician-light pathway is acceptable and feasible, without increasing the burden on limited breast and genetics workforce, and has high satisfaction. Simplified testing criteria could improve identification of gPVs in HP-BCSGs. The concomitant increased pick-up of gPVs in IP-BCSGs warrants further consideration.
Highlights:
- In this real-world evaluation we observed the successful rollout of the BRCA-DIRECT streamlined, clinician-light mainstreaming pathway for a pilot of germline breast cancer susceptibility gene testing in 3517 unselected breast cancer patients from 14 regional breast oncology/surgical units.
- Patients undergoing testing via the pathway reported high levels of satisfaction and low decisional regret, with breast and genetics healthcare professionals highly recommending the pathway for mainstream testing.
- Differences were observed between breast healthcare professionals preferring unselected breast cancer patient testing and genetics healthcare professionals preferring restriction to current national testing criteria due to broader concerns around equity of access to testing.
- We identified that current national testing criteria would have missed identifying 50.8% of germline pathogenic variants in high-penetrance, clinically actionable genes, likely having implications for treatment and surgical decision-making in the breast cancer patients.
- We evaluated the performance of two additional approaches for establishing testing eligibility criteria to understand how we could best balance maximising identification of germline pathogenic variants (sensitivity) whilst limiting (unnecessary) testing within the breast cancer patient population (specificity).
Sikorska, J.; Krawczynski, M. R.; Korwin, M.; Ołdak, M.; Bartnik, E.; Tonska, K.; Piotrowska-Nowak, A.
Leber hereditary optic neuropathy (LHON) is primarily caused by pathogenic mitochondrial DNA (mtDNA) variants, most commonly the m.11778G>A variant in the MT-ND4 gene. The presence of this variant alone is insufficient to trigger disease symptoms, of which vision loss is the hallmark. Given the incomplete penetrance and inter-population variability in modifying factors, this study aimed to investigate two previously proposed genetic risk factors for LHON in the Polish population. Using quantitative PCR, we measured the mtDNA copy number in peripheral blood of affected and unaffected carriers of the m.11778G>A variant. In addition, we assessed the frequency of the PRICKLE3 c.157C>T variant in symptomatic, asymptomatic and control individuals using PCR-RFLP. Our results indicate that neither mtDNA copy number nor the presence of the PRICKLE3 variant is associated with LHON symptom manifestation in the Polish cohort under the conditions tested, in contrast to previously reported associations in other populations. These findings suggest that the incomplete penetrance of LHON in the Polish population may involve other modifying factors, such as yet unidentified nuclear DNA variants.
Research highlights:
- Mitochondrial DNA (mtDNA) copy number and the presence of the c.157C>T variant in the PRICKLE3 gene do not influence the manifestation of Leber hereditary optic neuropathy (LHON) symptoms in the Polish population.
- The results support a geographic dependence of genetic risk factors affecting the penetrance of LHON-associated mtDNA variants.
Connelly, E.; Laraway, B.; Mullen, K. R.; Mungall, C. J.; Haendel, M. A.; Hurwitz, E.
Fanconi anemia (FA) is a rare genetic disorder of impaired DNA repair characterized by progressive bone marrow failure, congenital malformations, and cancer predisposition. Early identification of individuals with FA is critical for timely clinical management, yet phenotype-driven approaches to FA identification are hindered by inconsistencies in existing phenotypic profiles. We compared the Human Phenotype Ontology (HPO) annotations for FA in OMIM (215 terms across 22 complementation group entries) and Orphanet (106 terms in a single entry, ORPHA:84), quantifying overlap and anatomical system coverage. To address identified gaps, we developed a comprehensive custom HPO profile by extracting phenotypic terms from the entire Fanconi Cancer Foundation (FCF) Clinical Care Guidelines using OntoGPT, an LLM-based ontology extraction tool, followed by manual curation to ensure accuracy and clinical relevance. OMIM and Orphanet shared only 36 HPO terms (12.6% of their combined 285 unique terms), demonstrating substantial discordance. Our custom profile comprises 264 unique HPO terms, of which 161 (61.0%) are novel and not present in either existing source. The novel terms expand coverage particularly in musculoskeletal (39 terms, 23.8%), genitourinary (26 terms, 15.9%), limb (26 terms, 15.9%), head or neck (20 terms 12.2%), and digestive system (17 terms, 10.4%) phenotypes. Community-curated phenotypic profiles derived from clinical practice guidelines can substantially augment existing disease annotations. Our FA profile, the most comprehensive HPO-based phenotypic characterization of FA to date, is publicly available and provides a foundation for improved clinical decision support and EHR-based computable phenotyping that can accelerate diagnosis for individuals with FA. Furthermore, the LLM-assisted approach offers generalizable methods to improve the diagnostic odyssey for all rare diseases.
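The overlap figure in this abstract follows from simple set arithmetic, sketched below. The helper name and the toy term labels are hypothetical placeholders rather than real HPO identifiers; only the counts 215, 106, and 36 come from the abstract.

```python
def overlap_summary(a: set, b: set) -> dict:
    """Count shared and combined terms and the shared share of the union."""
    shared = a & b
    union = a | b
    return {
        "shared": len(shared),
        "union": len(union),
        "percent_shared": 100.0 * len(shared) / len(union),
    }


# Toy example with placeholder labels (not real HPO IDs).
source_a = {"term_1", "term_2", "term_3", "term_4"}
source_b = {"term_3", "term_4", "term_5"}
toy = overlap_summary(source_a, source_b)

# Counts reported in the abstract: 215 OMIM terms, 106 Orphanet terms, 36 shared.
combined_unique = 215 + 106 - 36
percent_shared = 100.0 * 36 / combined_unique
```

With the reported counts, the combined set has 285 unique terms and the shared fraction rounds to 12.6%, matching the abstract.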
Kovanda, A.; Hodzic, A.; Kotnik, U.; Visnjar, T.; Podgrajsek, R.; Andjelic, A.; Jaklic, H.; Maver, A.; Lovrecic, L.; Peterlin, B.
STUDY QUESTION[Do structural genomic variants, that can be identified by using optical genome mapping, contribute to male infertility?] SUMMARY ANSWER[By using optical genome mapping we can identify several types of structural variants, both known and new, that may contribute to male infertility.] WHAT IS KNOWN ALREADY[Traditional approaches such as karyotyping, CFTR and chromosome Y microdeletion testing are successful in explaining clinical findings in ~30% of MI patients, leaving the rest without a genetic diagnosis. Recent research suggests at least 265 genes may play a role in male fertility. While the assessment of the roles of copy number variants and single nucleotide variants in monogenic forms of disease in these genes is underway, much less is known about structural variants.] STUDY DESIGN, SIZE, DURATION[We performed a longitudinal case/control study on a total of 220 individuals; 88 patients with male infertility, negative for cytogenetic abnormalities using karyotyping, and molecular testing for chrY microdeletions, and CFTR gene variants, and 132 healthy male individuals that underwent optical genomic mapping for other reasons. Exclusion criteria for the control cohort were low-sperm quality and/or inclusion in IVF procedures. The study was approved by the National Medical Ethics Committee of the Republic of Slovenia (reference number: 0120-213/2022/6). Optical genome mapping was performed from an aliquot of whole blood collected for routine testing purposes at the Clinical Institute of Genomic Medicine (CIGM), UMC Ljubljana from January 2023 to November 2024.] PARTICIPANTS/MATERIALS, SETTING, METHODS[We examined structural variants in 220 participants by using optical genome mapping, which was performed with DLE-1 SP-G2 chemistry and the Saphyr instrument. 
The de novo assembly and Variant Annotation Pipeline were executed on Bionano Solve3.7_20221013_25 while reporting and direct visualization of structural variants was done on Bionano Access 1.7.2. All obtained variants were filtered using the Bionano Access software and in-house generated gene/regions of interest panel bed files. The first filter was applied to include variants below a population frequency of 10%, and overlapping the regions of interest. Subsequently, all variants occurring with frequency 0% in the internal manufacturer variant dataset were manually evaluated for possible involvement of the overlapping genes or regions in biological processes involved in MI. The male infertility cohort also underwent research whole exome analyses as previously reported. All results of optical genomic mapping were confirmed by an appropriate alternative method where available.] MAIN RESULTS AND THE ROLE OF CHANCE[We show that the overall number of structural variants in MI patients does not differ from that of healthy individuals. By looking in detail at genes and regions associated with MI, we identified 21 rare variants absent from controls in 25.0% of MI patients, of which five were likely causative, and two would be missed by using traditional approaches. These variants include inversions, duplications, amplifications, deletions (e.g. SPAG1), and insertions/expansions (e.g. DMPK), that were validated using additional methods. While the remaining SV cannot be currently classified as pathogenic according to existing criteria, they open a new avenue in genetic research of MI.]
LARGE SCALE DATA[Variants reported in this study were deposited into ClinVar under accession numbers SUB15650956 (https://www.ncbi.nlm.nih.gov/clinvar/)] LIMITATIONS, REASONS FOR CAUTION[Technical limitations of optical genome mapping include the lack of DLE-1 labelling of centromeric and telomeric regions, the inability to detect Robertsonian translocations, the unclear exact location of smaller structural variants located between the DLE-1 labels, and unclear boundaries in case of their location in segmentally duplicated regions (this limitation is shared with other methods). The ACMG criteria of rarity are also hard to apply, as the fertility status of the individuals in healthy population databases such as gnomAD and DGV is unknown. Similarly, gene-associated phenotype and the proposed inheritance model both need to be considered as parts of the ACMG criteria, but for many candidate genes associated with MI, no model of inheritance has yet been proposed.] WIDER IMPLICATIONS OF THE FINDINGS[Currently, with the established diagnostic approaches we are able to resolve ~30% of male infertility cases, with ~70% of patients remaining undiagnosed. The significance of our work is in showing that rare structural variants can be identified in MI, by using optical genome mapping, opening new avenues of research of the genetics of this important contributor to human fertility.] STUDY FUNDING/COMPETING INTEREST(S)[All authors declare having no conflict of interest in regard to this research. This work was funded by the Slovenian Research and Innovation Agency (ARIS) Programme grant P3-0326: Gynecology and Reproduction: Genomics for personalized medicine] Lay summary: Male infertility affects about 5% of adult males and has complex causes, including genetic ones, such as mutations in the CFTR gene, small deletions on chromosome Y, and balanced translocations, but currently we can only find a genetic cause in ~30% of patients. 
This means ~70% of cases remain undiagnosed, although they too may have an as yet unknown genetic cause. Indeed, at least 265 genes have so far been proposed to play a role in male fertility. In these genes, there has been limited research on single nucleotide variants and copy number variants, and many structural variants are not visible with the methods commonly used in clinical genetic testing. Therefore, apart from chromosome Y microdeletions and chromosomal numerical and structural anomalies, such as balanced translocations, the role of smaller structural variants in male infertility is unknown, but based on what we know from other diseases, they may also play a role in male infertility. Optical genome mapping is a novel method for the detection of structural variants, such as balanced and unbalanced translocations, insertions, duplications, deletions, and complex structural rearrangements in a wide range of sizes. By using optical genome mapping to test a cohort of 88 infertile men and 132 healthy controls, we aimed to provide the first insights into the range of SVs that may be associated with MI. We found that the overall number of structural variants in MI patients was not significantly different from that in the control group. However, by looking at genes and regions associated with MI, we found rare structural variants that are absent from controls in 25.0% of MI patients. These variants include inversions, duplications, amplifications, deletions (e.g. a deletion in SPAG1), and insertions/expansions (e.g. in DMPK) that were validated using additional methods. Five of these variants (5.6%) were likely causative, and two would be missed by traditional approaches. While the remaining SVs cannot currently be classified as pathogenic according to existing criteria, they open a new avenue in genetic research of MI.
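The two-stage variant filtering described in the methods of this abstract can be illustrated with a minimal Python sketch. It is not the Bionano Access implementation: the `StructuralVariant` fields, the `Region` tuple layout, the closed-interval overlap test, and the example coordinates are all hypothetical and chosen only to show the logic (population frequency below 10% plus overlap with a region of interest, then a frequency of 0% in an internal dataset before manual review).

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical record layout; real optical-mapping exports use different fields.
@dataclass
class StructuralVariant:
    chrom: str
    start: int
    end: int
    sv_type: str
    population_freq: float  # fraction of a control database carrying the SV
    internal_freq: float    # fraction of an internal dataset carrying the SV

# A region of interest as (chrom, start, end), treated as a closed interval.
Region = Tuple[str, int, int]

def overlaps_any(sv: StructuralVariant, regions: List[Region]) -> bool:
    """Return True if the SV shares at least one position with any region."""
    return any(
        chrom == sv.chrom and sv.start <= r_end and r_start <= sv.end
        for chrom, r_start, r_end in regions
    )

def first_filter(svs, regions, max_population_freq=0.10):
    """Keep SVs below the population-frequency cutoff that overlap a region."""
    return [
        sv for sv in svs
        if sv.population_freq < max_population_freq and overlaps_any(sv, regions)
    ]

def second_filter(svs):
    """Keep SVs never seen in the internal dataset; these go to manual review."""
    return [sv for sv in svs if sv.internal_freq == 0.0]

# Made-up example data, for illustration only.
regions = [("chr8", 100_000, 200_000)]
svs = [
    StructuralVariant("chr8", 150_000, 160_000, "deletion", 0.00, 0.00),
    StructuralVariant("chr8", 150_000, 160_000, "duplication", 0.25, 0.00),
    StructuralVariant("chr8", 120_000, 125_000, "insertion", 0.05, 0.02),
    StructuralVariant("chr1", 150_000, 160_000, "inversion", 0.00, 0.00),
]
candidates = second_filter(first_filter(svs, regions))
```

In this toy input only the chr8 deletion survives both filters: the duplication is too common, the insertion is present in the internal dataset, and the inversion lies outside the regions of interest.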
Destouni, A.; Uuskula, B.; Lanillos, J.; Teder, H.; Paluoja, P.; Metsvaht, T.; Rodriguez-Antona, C.; Salumets, A.; Krjutshkov, K.
Irreversible profound hearing loss in early childhood severely impairs the development of spoken language, behavior and cognition. Hearing loss caused by aminoglycoside antibiotics in neonates treated for sepsis in intensive care units is linked to variants in the MT-RNR1 gene. Identifying the population at risk in acute medical settings is substantially limited because genotyping is restricted to m.1555A>G only, and the currently approved point-of-care test has a 20% failure rate. We report an innovative prenatal pharmacogenetic approach based on the parallel genome-wide analysis of mitochondrial and nuclear cell-free DNA, which co-exist in routine non-invasive prenatal testing (NIPT) sequencing data. Following analysis of 5,529 NIPT cases, we reached a 99.3% cumulative call rate with 100% sensitivity and specificity for the clinically actionable variants m.1095T>C, m.1494C>T and m.1555A>G. Since NIPT is a globally adopted first- and second-tier prenatal test, our approach could revolutionize early intervention strategies for aminoglycoside-induced hearing loss and improve clinical decision-making.
Buianova, A. A.; Cheranev, V. V.; Shmitko, A. O.; Vasiliadis, I. A.; Ilyina, G. A.; Suchalko, O. N.; Kuznetsov, M. I.; Belova, V. A.; Korostin, D. O.
Introduction: Adverse drug reactions (ADRs) remain a major public health issue, and genetic factors contribute importantly to interindividual variability in drug response. Pharmacogenetic testing helps reduce ADR risk by optimizing drug selection and dosage, particularly in monogenic disorders. Material and Methods: Whole-exome sequencing of 6,739 samples from the Russian population was performed using the MGIEasy Universal DNA Library Prep Set on the DNBSEQ-G400 platform (MGI). Variants in 48 genes were examined, focusing on inherited arrhythmias (Long QT syndrome, Short QT syndrome, Timothy syndrome, Andersen-Tawil syndrome, Brugada syndrome, Atrial fibrillation, Catecholaminergic polymorphic ventricular tachycardia), enzyme deficiencies (Glucose-6-Phosphate Dehydrogenase Deficiency [G6PDD], Porphyrias), Dravet Syndrome (DS) and Malignant Hyperthermia (MH). All identified variants had been reported at least once as pathogenic (P) or likely pathogenic (LP) in ClinVar, along with those occasionally classified as variants of uncertain significance (VUS). Each variant was manually re-evaluated according to ACMG criteria. Results: A total of 75 unique variants in 18 genes were observed in 119 individuals (1.77%), including 21 carriers and 13 women with a G6PD mutation. Of these, 46 variants were classified as P, 21 as LP, and 8 as VUS. Missense variants accounted for the largest proportion (73.33%). The most affected genes were KCNQ1 (24/119), which exhibited the highest number of unique variants (18), G6PD (20/119), SCN1A (15/119), and RYR1 (14/119). Regarding associated conditions, mutations linked to arrhythmias were found in 51 individuals, MH in 27, G6PDD in 20, DS in 15, and Porphyrias in 6. Conclusions: Incorporating genetic information on both common and rare clinically actionable variants into therapeutic decision-making has the potential to improve medication safety, reduce preventable ADRs, and enhance the effectiveness of personalized pharmacotherapy.
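The counts reported in this abstract are internally consistent, which a short Python check makes explicit. All numbers below are taken from the abstract itself; the variable names are illustrative, and the reading that each individual is counted under exactly one condition is an inference from the totals, not something the abstract states.

```python
# Figures quoted in the abstract above.
cohort_size = 6739
individuals_with_variants = 119

variants_by_class = {"P": 46, "LP": 21, "VUS": 8}
individuals_by_condition = {
    "arrhythmias": 51,
    "malignant hyperthermia": 27,
    "G6PD deficiency": 20,
    "Dravet syndrome": 15,
    "porphyrias": 6,
}

# Share of the cohort with at least one reported variant, in percent.
carrier_pct = round(100 * individuals_with_variants / cohort_size, 2)

# Totals that should match the 75 unique variants and 119 individuals.
total_variants = sum(variants_by_class.values())
total_individuals = sum(individuals_by_condition.values())

print(f"{carrier_pct}% of the cohort; {total_variants} variants; "
      f"{total_individuals} individuals across conditions")
```

The per-condition counts add up to exactly 119, which suggests (but does not prove) that no individual was counted under more than one condition.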
Yepez, V. A.; Luknarova, R.; Beijer, D.; Estevez-Arias, B.; Mei, D.; Morsy, H.; Mueller, J. S.; Polavarapu, K.; Demidov, G.; Doornbos, C.; Ellwanger, K.; Krass, L.; Laurie, S.; Matalonga, L.; Abdelrazek, I. M.; Astuti, G.; Bisulli, F.; Brechtmann, F.; Dabad, M.; Denomme Pichon, A. S.; Drakos, M.; Eddafir, Z.; Garrabou, G.; Guerrini, R.; Johari, M.; Kegele, J.; Kilicarslan, O. A.; Koelbel, H.; Kolen, I. H. M.; Licchetta, L.; Lochmueller, H.; Maassen, K.; Macken, W.; Mertes, C.; Milisenda, J. C.; Minardi, R.; Mostacci, B.; Neveling, K.; Oud, M. M.; Park, J.; Pujol, A.; Roos, A.; Sagath, L.; van
RNA sequencing (RNA-seq) provides a powerful complement to DNA sequencing for uncovering pathogenic defects affecting gene expression and splicing in individuals with genetically undiagnosed rare disorders. However, as large rare disease consortia adopt RNA-seq, challenges arise due to cohort heterogeneity, variability in tissues and sample sizes, and differences in interpretation practices. Here, we present a harmonized analytical and interpretation framework developed by the pan-European Solve-RD consortium to address these challenges. We analyzed 521 RNA-seq samples from whole blood, fibroblasts, muscle and peripheral blood mononuclear cells collected across more than 30 clinics and five European Reference Networks. Aberrant expression and splicing events were identified using OUTRIDER and FRASER 2.0 and analyzed through a standardized four-level scoring framework that encompassed RNA-seq outlier reliability, phenotype relevance, variant mechanism, and segregation evidence, captured in structured reports for interpretation. Regular meetings and collaborative "Solvathon" workshops were used to evaluate variant pathogenicity. This effort resulted in molecular diagnoses for 19 families out of 248 (7.7%) for whom DNA analyses had been inconclusive. Furthermore, three cases diagnosed using DNA analyses were confirmed, and 49 candidate events and five novel candidate disease genes were identified in the remaining families. Our results demonstrate the feasibility and impact of large-scale, standardized RNA-seq analysis in a transnational research setting. This framework provides a model for other international initiatives such as the Undiagnosed Diseases Network and ERDERA, paving the way for broader clinical implementation of transcriptome-based rare disease diagnostics.
Radjasandirane, R.; Cretin, G.; Diharce, J.; de Brevern, A. G.; Gelly, J.-C.
Predicting the pathogenic impact of missense variants is essential for understanding and diagnosing genetic diseases. Prediction approaches have evolved significantly, with the latest methods based on deep learning. Nonetheless, only a limited number exploit the potential of Protein Language Models (PLMs), which have demonstrated strong performance across various protein-related tasks. A new predictor, called PATHOS, was developed; it combines embeddings from an optimal set of two PLMs, namely ESM C 600M and Ankh 2 Large. Their embeddings were combined with additional crucial biological features such as phylogenetic probabilities, allele frequency, and protein annotations; they were aggregated using a fully connected layer architecture. Compared with 65 other predictors on clinical data, PATHOS outperforms the state of the art. It achieves a Matthews Correlation Coefficient (MCC) of 0.591 on a manually and carefully curated clinical dataset and 0.826 on a ClinVar dataset, surpassing other leading tools. Furthermore, case studies on the progesterone receptor and the KCNQ1 ion channel illustrate that PATHOS can identify functionally critical regions and known pathogenic mutations missed by other leading predictors like AlphaMissense. To ensure broad accessibility and facilitate use by non-specialists, a user-friendly web server containing a database of 140 million precomputed predictions for human proteins from Swiss-Prot was provided. The web server is available at: https://dsimb.inserm.fr/PATHOS/
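For readers unfamiliar with the metric quoted above, the Matthews Correlation Coefficient can be computed from a binary confusion matrix as in the sketch below. This is the standard textbook formula, not code from PATHOS; the function name `matthews_corrcoef_counts` and the example counts are illustrative assumptions, and returning 0.0 when the denominator vanishes is a common convention rather than part of the definition.

```python
import math

def matthews_corrcoef_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC from confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    It ranges from -1 (total disagreement) to +1 (perfect prediction),
    with 0 meaning no better than chance.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Convention: undefined cases (an empty row or column) map to 0.
        return 0.0
    return (tp * tn - fp * fn) / denominator

# Made-up counts for illustration: pathogenic = positive, benign = negative.
perfect = matthews_corrcoef_counts(tp=50, tn=50, fp=0, fn=0)
chance = matthews_corrcoef_counts(tp=25, tn=25, fp=25, fn=25)
inverted = matthews_corrcoef_counts(tp=0, tn=0, fp=50, fn=50)
```

Because MCC uses all four cells of the confusion matrix, it is less sensitive to class imbalance than accuracy, which is one reason it is commonly reported when benchmarking variant-effect predictors.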
Dilliott, A. A.; Iacoangeli, A.; Project MinE ALS Sequencing Consortium, ; Al-Chalabi, A.; Al Khleifat, A.; Farhan, S. M. K.
ATXN2 expansions of ≥33 CAG-repeats are associated with spinocerebellar ataxia type 2, while "intermediate" length expansions have been associated with amyotrophic lateral sclerosis (ALS). Yet, no consensus has been established regarding the lengths that define a true association with ALS risk, with recent studies debating between a lower limit of ≥29 or ≥31 repeats. Here, we assessed the risk of ALS imparted by various ATXN2 repeat lengths, in the largest meta-analysis to date, to establish an accepted lower limit of repeats that imparts risk of disease. We identified 19 studies with carrier counts of expansions ranging from 24 to ≥34 repeats in cohorts of individuals with ALS and controls, which we meta-analysed together with the ATXN2 repeat lengths of the large-scale Project MinE ALS Consortium dataset (total individuals with ALS = 19202; total controls = 22177), and determined a lower limit of 30 repeats defining significant ALS risk. These findings were validated with a secondary assessment of the individuals with ALS captured within the meta-analysis using the gnomAD short tandem repeat dataset as a proxy control cohort. We also applied our defined ATXN2 repeat risk threshold to explore relationships with ALS clinical outcomes. While we did not observe a significant relationship between ATXN2 repeat lengths and age of ALS onset, we did identify a significant inverse correlation between ATXN2 repeat lengths as a continuous metric and duration of disease and found that individuals with ALS carrying the risk variant allele of ≥30 repeats had significantly shorter times to diagnosis than those without the repeat expansion. Our comprehensive analyses propose a lower-limit threshold of ≥30 ATXN2 trinucleotide repeats in length defining true ALS risk.
These findings are important for improving the accuracy of risk interpretation and guidance for patients and their families, particularly as clinical genetic testing efforts continue to expand and the need to guide targeted clinical trial inclusion criteria grows. Our results may also aid in future analyses assessing ATXN2 pathogenic mechanisms and therapeutic strategies.
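To make the idea of testing a repeat-length threshold concrete, the following Python sketch computes an odds ratio with a Woolf (log-scale) 95% confidence interval from a single 2x2 table of carriers versus non-carriers. It is only an illustration of the general technique: the abstract does not state which meta-analytic model the authors used, the function name `odds_ratio_ci` is hypothetical, and the carrier counts in the example are invented placeholders rather than data from the study.

```python
import math

def odds_ratio_ci(case_carriers, case_total, control_carriers, control_total,
                  z=1.96):
    """Odds ratio and Woolf confidence interval for one carrier threshold.

    Returns (odds_ratio, lower, upper). A 0.5 continuity correction
    (Haldane-Anscombe) is applied if any cell of the 2x2 table is zero.
    """
    a = case_carriers
    b = case_total - case_carriers
    c = control_carriers
    d = control_total - control_carriers
    if min(a, b, c, d) == 0:
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Invented placeholder counts: carriers of alleles at or above a threshold.
or_example, lo_example, hi_example = odds_ratio_ci(
    case_carriers=20, case_total=100,
    control_carriers=10, control_total=100,
)
```

A threshold would be considered to confer risk under this simple scheme when the lower bound of the interval exceeds 1; a real meta-analysis would additionally pool per-study estimates and account for between-study heterogeneity.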